The development of the spoken corpus of Japanese learner English
Abstract
1. Introduction. To keep up with an information-driven society, acquiring foreign languages, especially English for international communication, is one of the most important goals. To develop a computer-assisted language teaching and learning environment, we have been compiling a large-scale speech corpus of Japanese learner English, which provides useful information for constructing a model of the developmental stages of Japanese learners' speaking ability. In this paper, we first present an overview of the project by introducing the activities carried out so far: the data collection procedure, the annotation schemes (including error tagging), and the development of an original software tool that makes the data collection process easier. Second, we explain how the corpus can be exploited for second language acquisition research and system development by introducing several experiments we have conducted, such as a linguistic analysis of how the English article system is acquired by Japanese learners at different proficiency levels, and attempts at automatic processing of learner language using Natural Language Processing (NLP) techniques.
Similar resources
Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners
Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...
Metadiscourse Markers in a Corpus of Learner Language: The Case of Iranian EFL Learners
Different issues have been probed in learner corpus research since the late 1980s. However, given the importance of metadiscourse markers (MDMs) in signposting academic discourse, their use in Iranian EFL learners' academic essays is an area of research in need of a more serious analysis. Contributing to this line of investigation, this paper reports a corpus-based study of the use of MDMs i...
Spoken English Learner Corpora
In this paper we present a survey of some of the most significant spoken English learner corpora created to date. Spoken learner corpora, which include speech generated by learners, are important in many areas of research and practice, in particular for identifying typical pronunciation errors of learners of English as a second language (ESL), English as a foreign language (EFL), or English as a lin...
The Effect of CMC in Business Emails in Lingua Franca: Discourse Features and Misunderstandings
The paper argues that everyday exchange of business emails produces a development in the work-group relationship, which, in turn, makes new communication styles possible and acceptable by the users' habit to computer-mediated forms, even in unbalanced professional exchanges. The focus is on the (spoken) discourse features of email messages in a self-compiled corpus of selected computer-mediated...
How textbooks (and learners) get it wrong: A corpus study of modal auxiliary verbs
Many elements contribute to the relative difficulty in acquiring specific aspects of English as a foreign language (Goldschneider & DeKeyser, 2001). Modal auxiliary verbs (e.g. could, might) are examples of a structure that is difficult for many learners. Not only are they particularly complex semantically, but especially in the Malaysian context ...
Publication date: 2003